MULTIPLY-SUBSTITUTED PROTEASE VARIANTS

Related Applications

The present application is a continuation-in-part application of United States
Patent Application 08/956,323, filed October 23, 1998, United States Patent
Application 08/956,564, filed October 23, 1998, and United States Patent Application
08/956,324, filed October 23, 1998, all of which are hereby incorporated herein in
their entirety.

Background of the Invention

Serine proteases are a subgroup of carbonyl hydrolases. They comprise a
diverse class of enzymes having a wide range of specificities and biological
functions. Stroud, R., Sci. Amer., 131:74-88. Despite their functional diversity, the
catalytic machinery of serine proteases has been approached by at least two
genetically distinct families of enzymes: 1) the subtilisins and 2) the mammalian
chymotrypsin-related and homologous bacterial serine proteases (e.g., trypsin and
S. griseus trypsin). These two families of serine proteases show remarkably similar
mechanisms of catalysis. Kraut, J. (1977), Annu. Rev. Biochem., 46:331-358.
Furthermore, although the primary structure is unrelated, the tertiary structure of
these two enzyme families brings together a conserved catalytic triad of amino acids
consisting of serine, histidine and aspartate.

Subtilisins are serine proteases (approx. MW 27,500) which are secreted in
large amounts from a wide variety of Bacillus species and other microorganisms.
The protein sequence of subtilisin has been determined from at least nine different
species of Bacillus. Markland, F.S., et al. (1983), Hoppe-Seyler's Z. Physiol. Chem.,
364:1537-1540. The three-dimensional crystallographic structures of subtilisins from
Bacillus amyloliquefaciens, Bacillus licheniformis and several natural variants of B.
lentus have been reported. These studies indicate that although subtilisin is
genetically unrelated to the mammalian serine proteases, it has a similar active site
structure. The x-ray crystal structures of subtilisin containing covalently bound
peptide inhibitors (Robertus, J.D., et al. (1972), Biochemistry, 11:2439-2449) or
product complexes (Robertus, J.D., et al. (1976), J. Biol. Chem., 251:1097-1103)
have also provided information regarding the active site and putative substrate
binding cleft of subtilisin. In addition, a large number of kinetic and chemical
modification studies have been reported for subtilisin (Svendsen, B. (1976),
Carlsberg Res. Commun., 41:237-291; Markland, F.S., Id.), as well as at least one
report wherein the side chain of methionine at residue 222 of subtilisin was
converted by hydrogen peroxide to methionine-sulfoxide (Stauffer, D.C., et al.
(1965), J. Biol. Chem., 244:5333-5338), and extensive site-specific mutagenesis has
been carried out (Wells and Estell (1988) TIBS 13:291-297).

Summary of the Invention 

It is an object herein to provide a protease variant containing a substitution of
an amino acid at one or more residue positions corresponding to residue positions
selected from the group consisting of 62, 212, 230, 232, 252 and 257 of Bacillus
amyloliquefaciens subtilisin.

While any combination of the above listed amino acid substitutions may be
employed, the preferred protease variant enzymes of the present invention comprise
the substitution of amino acid residues in the following combinations. All of the
residue positions correspond to positions of Bacillus amyloliquefaciens subtilisin:

(1) a protease variant including substitutions of the amino acid residues at
position 62 and at one or more of the following positions: 103, 104, 109, 159, 213,
232, 236, 245, 248 and 252;

(2) a protease variant including substitutions of the amino acid residues at
position 212 and at one or more of the following positions: 12, 98, 102, 103, 104, 159,
232, 236, 245, 248 and 252;

(3) a protease variant including substitutions of the amino acid residues at
position 230 and at one or more of the following positions: 68, 103, 104, 159, 232,
236 and 245;

(4) a protease variant including substitutions of the amino acid residues at
position 232 and at one or more of the following positions: 1, 9, 12, 61, 62, 68, 76,
97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 205, 209, 210, 212, 213,
217, 230, 236, 245, 248, 252, 257, 260, 270 and 275;

(5) a protease variant including substitutions of the amino acid residues at
position 232 and at one or more of the following positions: 103, 104, 236 and 245;

(6) a protease variant including substitutions of the amino acid residues at
positions 232 and 103 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 76, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 205, 209, 210, 212,
213, 217, 230, 236, 245, 248, 252, 257, 260, 270 and 275;

(7) a protease variant including substitutions of the amino acid residues at
positions 232 and 104 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 76, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 205, 209, 210, 212,
213, 217, 230, 236, 245, 248, 252, 257, 260, 270 and 275;

(8) a protease variant including substitutions of the amino acid residues at
positions 232 and 236 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 76, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 205, 209, 210, 212,
213, 217, 230, 236, 245, 248, 252, 257, 260, 270 and 275;

(9) a protease variant including substitutions of the amino acid residues at
positions 232 and 245 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 76, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 205, 209, 210, 212,
213, 217, 230, 236, 245, 248, 252, 257, 260, 270 and 275;

(10) a protease variant including substitutions of the amino acid residues at
positions 232, 103, 104, 236 and 245 and at one or more of the following positions: 1,
9, 12, 61, 62, 68, 76, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 205,
209, 210, 212, 213, 217, 230, 236, 245, 248, 252, 257, 260, 270 and 275;

(11) a protease variant including substitutions of the amino acid residues at
position 252 and at one or more of the following positions: 1, 9, 12, 61, 62, 68, 97, 98,
101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 210, 212, 213, 217, 232, 236, 245,
248 and 270;

(12) a protease variant including substitutions of the amino acid residues at
position 252 and at one or more of the following positions: 103, 104, 236 and 245;

(13) a protease variant including substitutions of the amino acid residues at
positions 252 and 103 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 210, 212, 213, 217,
232, 236, 245, 248 and 270;

(14) a protease variant including substitutions of the amino acid residues at
positions 252 and 104 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 210, 212, 213, 217,
232, 236, 245, 248 and 270;

(15) a protease variant including substitutions of the amino acid residues at
positions 252 and 236 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 210, 212, 213, 217,
232, 236, 245, 248 and 270;

(16) a protease variant including substitutions of the amino acid residues at
positions 252 and 245 and at one or more of the following positions: 1, 9, 12, 61, 62,
68, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 210, 212, 213, 217,
232, 236, 245, 248 and 270;

(17) a protease variant including substitutions of the amino acid residues at
positions 252, 103, 104, 236 and 245 and at one or more of the following positions: 1,
9, 12, 61, 62, 68, 97, 98, 101, 102, 103, 104, 109, 130, 131, 159, 183, 185, 210, 212,
213, 217, 232, 236, 245, 248 and 270; and

(18) a protease variant including substitutions of the amino acid residues at
position 257 and at one or more of the following positions: 68, 103, 104, 205, 209,
210, 232, 236, 245 and 275.

More preferred protease variants are substitution sets
selected from the group consisting of residue positions corresponding to positions in
Table 1 of Bacillus amyloliquefaciens subtilisin:

Most preferred protease variants are substitution sets selected from the
group consisting of residue positions corresponding to positions in Table 2 of Bacillus
amyloliquefaciens subtilisin:

It is a further object to provide DNA sequences encoding such protease
variants, as well as expression vectors containing such variant DNA sequences.

Still further, another object of the invention is to provide host cells
transformed with such vectors, as well as host cells which are capable of expressing
such DNA to produce protease variants either intracellularly or extracellularly.

There is further provided a cleaning composition comprising a protease
variant of the present invention.

Additionally, there is provided an animal feed comprising a protease variant
of the present invention.

Also provided is a composition for the treatment of a textile comprising a
protease variant of the present invention.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1A-C depict the DNA and amino acid sequence for Bacillus
amyloliquefaciens subtilisin and a partial restriction map of this gene.

Fig. 2 depicts the conserved amino acid residues among subtilisins from
Bacillus amyloliquefaciens (BPN') and Bacillus lentus (wild-type).

Figs. 3A and 3B depict the amino acid sequence of four subtilisins. The top
line represents the amino acid sequence of subtilisin from Bacillus amyloliquefaciens
(also sometimes referred to as subtilisin BPN'). The second line depicts the
amino acid sequence of subtilisin from Bacillus subtilis. The third line depicts the
amino acid sequence of subtilisin from Bacillus licheniformis. The fourth line depicts the
amino acid sequence of subtilisin from Bacillus lentus (also referred to as subtilisin
309 in PCT WO89/06276). The symbol * denotes the absence of specific amino acid
residues as compared to subtilisin BPN'.

Detailed Description of the Invention 

Proteases are carbonyl hydrolases which generally act to cleave peptide
bonds of proteins or peptides. As used herein, "protease" means a naturally-occurring
protease or a recombinant protease. Naturally-occurring proteases include
α-aminoacylpeptide hydrolase, peptidylamino acid hydrolase, acylamino hydrolase,
serine carboxypeptidase, metallocarboxypeptidase, thiol proteinase,
carboxylproteinase and metalloproteinase. Serine, metallo, thiol and acid proteases
are included, as well as endo and exo-proteases.

The present invention includes protease enzymes which are non-naturally
occurring carbonyl hydrolase variants (protease variants) having a different
proteolytic activity, stability, substrate specificity, pH profile and/or performance
characteristic as compared to the precursor carbonyl hydrolase from which the
amino acid sequence of the variant is derived. Specifically, such protease variants
have an amino acid sequence not found in nature, which is derived by substitution of
a plurality of amino acid residues of a precursor protease with different amino acids.
The precursor protease may be a naturally-occurring protease or a recombinant
protease.

The protease variants useful herein encompass the substitution of any of the
nineteen naturally occurring L-amino acids at the designated amino acid residue
positions. Such substitutions can be made in any precursor subtilisin (procaryotic,
eucaryotic, mammalian, etc.). Throughout this application reference is made to
various amino acids by way of common one- and three-letter codes. Such codes
are identified in Dale, M.W. (1989), Molecular Genetics of Bacteria, John Wiley &
Sons, Ltd., Appendix B.

The protease variants useful herein are preferably derived from a Bacillus
subtilisin. More preferably, the protease variants are derived from Bacillus lentus
subtilisin and/or subtilisin 309.

Subtilisins are bacterial or fungal proteases which generally act to cleave
peptide bonds of proteins or peptides. As used herein, "subtilisin" means a naturally-occurring
subtilisin or a recombinant subtilisin. A series of naturally-occurring
subtilisins is known to be produced and often secreted by various microbial species.
Amino acid sequences of the members of this series are not entirely homologous.
However, the subtilisins in this series exhibit the same or similar type of proteolytic
activity. This class of serine proteases shares a common amino acid sequence
defining a catalytic triad which distinguishes them from the chymotrypsin related
class of serine proteases. The subtilisins and chymotrypsin related serine proteases
both have a catalytic triad comprising aspartate, histidine and serine. In the subtilisin
related proteases the relative order of these amino acids, reading from the amino to
carboxy terminus, is aspartate-histidine-serine. In the chymotrypsin related
proteases, the relative order, however, is histidine-aspartate-serine. Thus, subtilisin
herein refers to a serine protease having the catalytic triad of subtilisin related
proteases. Examples include but are not limited to the subtilisins identified in Fig. 3
herein. Generally and for purposes of the present invention, numbering of the amino
acids in proteases corresponds to the numbers assigned to the mature Bacillus
amyloliquefaciens subtilisin sequence presented in Fig. 1.
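
The ordering rule for the catalytic triad can be illustrated with a short Python sketch. It is illustrative only: the function name `triad_family` is hypothetical, the subtilisin positions (Asp32, His64, Ser221) are those recited in this specification, and the chymotrypsin positions (His57, Asp102, Ser195) are the conventional chymotrypsin numbering, supplied here as an outside example.

```python
def triad_family(asp_pos: int, his_pos: int, ser_pos: int) -> str:
    """Classify a serine protease by the amino-to-carboxy order of its triad."""
    # Sort the three residues by sequence position and read off their order.
    order = "".join(
        name for _, name in sorted([(asp_pos, "D"), (his_pos, "H"), (ser_pos, "S")])
    )
    if order == "DHS":
        return "subtilisin-related"      # aspartate-histidine-serine
    if order == "HDS":
        return "chymotrypsin-related"    # histidine-aspartate-serine
    return "other"


# Asp32 / His64 / Ser221 of B. amyloliquefaciens subtilisin
print(triad_family(asp_pos=32, his_pos=64, ser_pos=221))   # subtilisin-related
# His57 / Asp102 / Ser195 (conventional chymotrypsin numbering, outside example)
print(triad_family(asp_pos=102, his_pos=57, ser_pos=195))  # chymotrypsin-related
```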

"Recombinant subtilisin" or "recombinant protease" refers to a subtilisin or
protease in which the DNA sequence encoding the subtilisin or protease is modified
to produce a variant (or mutant) DNA sequence which encodes the substitution,
deletion or insertion of one or more amino acids in the naturally-occurring amino acid
sequence. Suitable methods to produce such modification, and which may be
combined with those disclosed herein, include those disclosed in US Patent RE
34,606, US Patent 5,204,015, US Patent 5,185,258, U.S. Patent 5,700,676, U.S.
Patent 5,801,038, and U.S. Patent 5,763,257.

"Non-human subtilisins" and the DNA encoding them may be obtained from
many procaryotic and eucaryotic organisms. Suitable examples of procaryotic
organisms include gram negative organisms such as E. coli or Pseudomonas and
gram positive bacteria such as Micrococcus or Bacillus. Examples of eucaryotic
organisms from which subtilisins and their genes may be obtained include yeast such
as Saccharomyces cerevisiae and fungi such as Aspergillus sp.

A "protease variant" has an amino acid sequence which is derived from the
amino acid sequence of a "precursor protease". The precursor proteases include
naturally-occurring proteases and recombinant proteases. The amino acid sequence
of the protease variant is "derived" from the precursor protease amino acid sequence
by the substitution, deletion or insertion of one or more amino acids of the precursor
amino acid sequence. Such modification is of the "precursor DNA sequence" which
encodes the amino acid sequence of the precursor protease rather than
manipulation of the precursor protease enzyme per se. Suitable methods for such
manipulation of the precursor DNA sequence include methods disclosed herein, as
well as methods known to those skilled in the art (see, for example, EP 0 328 299,
WO89/06279 and the US patents and applications already referenced herein).

Specific substitutions of amino acids at one or more residue positions
corresponding to residue positions selected from the group consisting of 62, 212,
230, 232, 252 and 257 of Bacillus amyloliquefaciens subtilisin are identified herein.

Preferred variants are those having combinations of substitutions at residue
positions corresponding to positions of Bacillus amyloliquefaciens subtilisin in Table 1.

More preferred variants are those having combinations of substitutions at
residue positions corresponding to positions of Bacillus amyloliquefaciens subtilisin in
Table 2.

Further preferred variants are those having combinations of substitutions at
residue positions corresponding to positions of Bacillus amyloliquefaciens subtilisin in
Table 3.

These amino acid position numbers refer to those assigned to the mature
Bacillus amyloliquefaciens subtilisin sequence presented in Fig. 1. The invention,
however, is not limited to the mutation of this particular subtilisin but extends to
precursor proteases containing amino acid residues at positions which are
"equivalent" to the particular identified residues in Bacillus amyloliquefaciens
subtilisin. In a preferred embodiment of the present invention, the precursor
protease is Bacillus lentus subtilisin and the substitutions are made at the equivalent
amino acid residue positions in B. lentus corresponding to those listed above.
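
The position groupings recited in the Summary have a common shape: one or more required positions plus at least one position from an accompanying list. The Python sketch below is illustrative only. It transcribes just combinations (1) and (18) from the Summary, and the names `PRIMARY`, `GROUPS`, `has_primary_substitution` and `matches_group` are hypothetical. All positions use Bacillus amyloliquefaciens subtilisin numbering.

```python
# Positions named in the first object of the invention.
PRIMARY = {62, 212, 230, 232, 252, 257}

# group number -> (positions that must all be substituted,
#                  positions of which at least one must also be substituted)
GROUPS = {
    1: ({62}, {103, 104, 109, 159, 213, 232, 236, 245, 248, 252}),
    18: ({257}, {68, 103, 104, 205, 209, 210, 232, 236, 245, 275}),
}


def has_primary_substitution(positions) -> bool:
    """True if at least one substituted position is in the primary group."""
    return bool(PRIMARY & set(positions))


def matches_group(positions, group_no: int) -> bool:
    """True if the substituted positions satisfy the numbered combination."""
    required, one_or_more = GROUPS[group_no]
    positions = set(positions)
    return required <= positions and bool(one_or_more & positions)


print(matches_group({62, 103}, 1))    # True: 62 plus one listed position
print(matches_group({62}, 1))         # False: no second position from the list
print(matches_group({257, 275}, 18))  # True
```

The remaining combinations could be added to `GROUPS` in the same form.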

A residue (amino acid) position of a precursor protease is equivalent to a
residue of Bacillus amyloliquefaciens subtilisin if it is either homologous (i.e.,
corresponding in position in either primary or tertiary structure) or analogous to a
specific residue or portion of that residue in Bacillus amyloliquefaciens subtilisin (i.e.,
having the same or similar functional capacity to combine, react, or interact
chemically).

In order to establish homology to primary structure, the amino acid sequence
of a precursor protease is directly compared to the Bacillus amyloliquefaciens
subtilisin primary sequence and particularly to a set of residues known to be invariant
in subtilisins for which sequence is known. For example, Fig. 2 herein shows the
conserved residues as between B. amyloliquefaciens subtilisin and B. lentus
subtilisin. After aligning the conserved residues, allowing for necessary insertions
and deletions in order to maintain alignment (i.e., avoiding the elimination of
conserved residues through arbitrary deletion and insertion), the residues equivalent
to particular amino acids in the primary sequence of Bacillus amyloliquefaciens
subtilisin are defined. Alignment of conserved residues preferably should conserve
100% of such residues. However, alignment of greater than 75% or as little as 50%
of conserved residues is also adequate to define equivalent residues. Conservation
of the catalytic triad, Asp32/His64/Ser221, should be maintained. Siezen et al.
(1991) Protein Eng. 4(7):719-737 shows the alignment of a large number of serine
proteases. Siezen et al. refer to the grouping as subtilases or subtilisin-like serine
proteases.

For example, in Fig. 3, the amino acid sequences of subtilisin from Bacillus
amyloliquefaciens, Bacillus subtilis, Bacillus licheniformis (carlsbergensis) and
Bacillus lentus are aligned to provide the maximum amount of homology between
amino acid sequences. A comparison of these sequences shows that there are a
number of conserved residues contained in each sequence. These conserved
residues (as between BPN' and B. lentus) are identified in Fig. 2.

These conserved residues, thus, may be used to define the corresponding
equivalent amino acid residues of Bacillus amyloliquefaciens subtilisin in other
subtilisins such as subtilisin from Bacillus lentus (PCT Publication No. WO89/06279
published July 13, 1989), the preferred protease precursor enzyme herein, or the
subtilisin referred to as PB92 (EP 0 328 299), which is highly homologous to the
preferred Bacillus lentus subtilisin. The amino acid sequences of certain of these
subtilisins are aligned in Figs. 3A and 3B with the sequence of Bacillus
amyloliquefaciens subtilisin to produce the maximum homology of conserved
residues. As can be seen, there are a number of deletions in the sequence of
Bacillus lentus as compared to Bacillus amyloliquefaciens subtilisin. Thus, for
example, the equivalent amino acid for Val165 in Bacillus amyloliquefaciens subtilisin
in the other subtilisins is isoleucine for B. lentus and B. licheniformis.
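
How residue numbering carries over from an alignment can be sketched in Python. This is a minimal illustration, not the procedure used to prepare the figures: the function name `equivalent_positions` is hypothetical, the two short aligned rows are invented toy sequences, and `*` marks an absent residue as in Fig. 3.

```python
def equivalent_positions(ref_aligned: str, other_aligned: str, gap: str = "*") -> dict:
    """Map each reference residue number to (own number, residue) in the other
    sequence, or to None where the other sequence has a gap.

    Both arguments are rows of the same alignment and must have equal length.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned rows must have equal length")
    mapping = {}
    ref_no = other_no = 0
    for ref_res, other_res in zip(ref_aligned, other_aligned):
        if ref_res != gap:
            ref_no += 1
        if other_res != gap:
            other_no += 1
        if ref_res != gap:
            mapping[ref_no] = (other_no, other_res) if other_res != gap else None
    return mapping


# Toy rows (invented): the second sequence lacks the residue aligned to position 4.
ref_row = "AQSVPYG"
other_row = "AQS*PWG"
table = equivalent_positions(ref_row, other_row)
print(table[4])  # None: deleted in the other sequence
print(table[6])  # (5, 'W'): reference position 6 corresponds to its own residue 5
```

Under this scheme a substitution is always named by the reference (Bacillus amyloliquefaciens) number, even where deletions shift the precursor's own numbering.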

"Equivalent residues" may also be defined by determining homology at the
level of tertiary structure for a precursor protease whose tertiary structure has been
determined by x-ray crystallography. Equivalent residues are defined as those for
which the atomic coordinates of two or more of the main chain atoms of a particular
amino acid residue of the precursor protease and Bacillus amyloliquefaciens
subtilisin (N on N, CA on CA, C on C and O on O) are within 0.13 nm and preferably
0.1 nm after alignment. Alignment is achieved after the best model has been
oriented and positioned to give the maximum overlap of atomic coordinates of non-hydrogen
protein atoms of the protease in question to the Bacillus amyloliquefaciens
subtilisin. The best model is the crystallographic model giving the lowest R factor for
experimental diffraction data at the highest resolution available.



$$R\ \text{factor} = \frac{\sum_h \big|\, |F_o(h)| - |F_c(h)| \,\big|}{\sum_h |F_o(h)|}$$
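
Both the R factor and the main chain distance criterion can be expressed in a few lines of Python. The sketch below is illustrative only: the function names, the dictionary layout for atoms and the sample coordinates are assumptions, and the coordinates are taken to be in nanometres and already superimposed as described above.

```python
import math

BACKBONE = ("N", "CA", "C", "O")


def r_factor(f_obs, f_calc) -> float:
    """R = sum(| |Fo(h)| - |Fc(h)| |) / sum(|Fo(h)|) over matched reflections h."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def backbone_equivalent(res_a, res_b, cutoff_nm=0.13, min_atoms=2) -> bool:
    """True if at least `min_atoms` like-named main chain atoms lie within
    `cutoff_nm` of each other.

    res_a, res_b: dict mapping atom name -> (x, y, z) in nm, after alignment.
    """
    close = 0
    for atom in BACKBONE:
        if atom in res_a and atom in res_b:
            if math.dist(res_a[atom], res_b[atom]) <= cutoff_nm:
                close += 1
    return close >= min_atoms


# Invented coordinates for one residue and a copy displaced by 0.05 nm along x.
residue = {"N": (0.0, 0.0, 0.0), "CA": (0.15, 0.0, 0.0),
           "C": (0.25, 0.1, 0.0), "O": (0.25, 0.22, 0.0)}
shifted = {name: (x + 0.05, y, z) for name, (x, y, z) in residue.items()}

print(backbone_equivalent(residue, shifted))                  # True at 0.13 nm
print(backbone_equivalent(residue, shifted, cutoff_nm=0.01))  # False
```

Passing `cutoff_nm=0.1` applies the preferred, stricter threshold.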



Equivalent residues which are functionally analogous to a specific residue of
Bacillus amyloliquefaciens subtilisin are defined as those amino acids of the
precursor protease which may adopt a conformation such that they either alter,
modify or contribute to protein structure, substrate binding or catalysis in a manner
defined and attributed to a specific residue of the Bacillus amyloliquefaciens
subtilisin. Further, they are those residues of the precursor protease (for which a
tertiary structure has been obtained by x-ray crystallography) which occupy an
analogous position to the extent that, although the main chain atoms of the given
residue may not satisfy the criteria of equivalence on the basis of occupying a
homologous position, the atomic coordinates of at least two of the side chain atoms
of the residue lie within 0.13 nm of the corresponding side chain atoms of Bacillus
amyloliquefaciens subtilisin. The coordinates of the three dimensional structure of
Bacillus amyloliquefaciens subtilisin are set forth in EPO Publication No. 0 251 446
(equivalent to US Patent 5,182,204, the disclosure of which is incorporated herein by
reference) and can be used as outlined above to determine equivalent residues on
the level of tertiary structure.

Some of the residues identified for substitution are conserved residues
whereas others are not. In the case of residues which are not conserved, the
substitution of one or more amino acids is limited to substitutions which produce a
variant which has an amino acid sequence that does not correspond to one found in
nature. In the case of conserved residues, such substitutions should not result in a
naturally-occurring sequence. The protease variants of the present invention include
the mature forms of protease variants, as well as the pro- and prepro-forms of such
protease variants. The prepro-forms are the preferred construction since this
facilitates the expression, secretion and maturation of the protease variants.

"Prosequence" refers to a sequence of amino acids bound to the N-terminal
portion of the mature form of a protease which when removed results in the
appearance of the "mature" form of the protease. Many proteolytic enzymes are
found in nature as translational proenzyme products and, in the absence of post-translational
processing, are expressed in this fashion. A preferred prosequence for
producing protease variants is the putative prosequence of Bacillus
amyloliquefaciens subtilisin, although other protease prosequences may be used.

A "signal sequence" or "presequence" refers to any sequence of amino acids
bound to the N-terminal portion of a protease or to the N-terminal portion of a
proprotease which may participate in the secretion of the mature or pro forms of the
protease. This definition of signal sequence is a functional one, meant to include all
those amino acid sequences encoded by the N-terminal portion of the protease gene
which participate in the effectuation of the secretion of protease under native
conditions. The present invention utilizes such sequences to effect the secretion of
the protease variants as defined herein. One possible signal sequence comprises
the first seven amino acid residues of the signal sequence from Bacillus subtilis
subtilisin fused to the remainder of the signal sequence of the subtilisin from Bacillus
lentus (ATCC 21536).

A "prepro" form of a protease variant consists of the mature form of the 
protease having a prosequence operably linked to the amino terminus of the 
protease and a "pre" or "signal" sequence operably linked to the amino terminus of 
the prosequence. 

"Expression vector" refers to a DNA construct containing a DNA sequence
which is operably linked to a suitable control sequence capable of effecting the
expression of said DNA in a suitable host. Such control sequences include a
promoter to effect transcription, an optional operator sequence to control such
transcription, a sequence encoding suitable mRNA ribosome binding sites and
sequences which control termination of transcription and translation. The vector may
be a plasmid, a phage particle, or simply a potential genomic insert. Once
transformed into a suitable host, the vector may replicate and function independently
of the host genome, or may, in some instances, integrate into the genome itself. In
the present specification, "plasmid" and "vector" are sometimes used
interchangeably as the plasmid is the most commonly used form of vector at present.
However, the invention is intended to include such other forms of expression vectors
which serve equivalent functions and which are, or become, known in the art.

The "host cells" used in the present invention generally are procaryotic or
eucaryotic hosts which preferably have been manipulated by the methods disclosed
in US Patent RE 34,606 to render them incapable of secreting enzymatically active
endoprotease. A preferred host cell for expressing protease is the Bacillus strain
BG2036 which is deficient in enzymatically active neutral protease and alkaline
protease (subtilisin). The construction of strain BG2036 is described in detail in US
Patent 5,264,366. Other host cells for expressing protease include Bacillus subtilis
I168 (also described in US Patent RE 34,606 and US Patent 5,264,366, the
disclosures of which are incorporated herein by reference), as well as any suitable
Bacillus strain such as B. licheniformis, B. lentus, etc.

Host cells are transformed or transfected with vectors constructed using
recombinant DNA techniques. Such transformed host cells are capable of either
replicating vectors encoding the protease variants or expressing the desired
protease variant. In the case of vectors which encode the pre- or prepro-form of the
protease variant, such variants, when expressed, are typically secreted from the host
cell into the host cell medium.

"Operably linked," when describing the relationship between two DNA
regions, simply means that they are functionally related to each other. For example,
a presequence is operably linked to a peptide if it functions as a signal sequence,
participating in the secretion of the mature form of the protein most probably
involving cleavage of the signal sequence. A promoter is operably linked to a coding
sequence if it controls the transcription of the sequence; a ribosome binding site is
operably linked to a coding sequence if it is positioned so as to permit translation.

The genes encoding the naturally-occurring precursor protease may be
obtained in accord with the general methods known to those skilled in the art. The
methods generally comprise synthesizing labeled probes having putative sequences
encoding regions of the protease of interest, preparing genomic libraries from
organisms expressing the protease, and screening the libraries for the gene of
interest by hybridization to the probes. Positively hybridizing clones are then
mapped and sequenced.

The cloned protease is then used to transform a host cell in order to express
the protease. The protease gene is then ligated into a high copy number plasmid.
This plasmid replicates in hosts in the sense that it contains the well-known elements
necessary for plasmid replication: a promoter operably linked to the gene in
question (which may be supplied as the gene's own homologous promoter if it is
recognized, i.e., transcribed, by the host), a transcription termination and
polyadenylation region (necessary for stability of the mRNA transcribed by the host
from the protease gene in certain eucaryotic host cells) which is exogenous or is
supplied by the endogenous terminator region of the protease gene and, desirably, a
selection gene such as an antibiotic resistance gene that enables continuous cultural
maintenance of plasmid-infected host cells by growth in antibiotic-containing media.
High copy number plasmids also contain an origin of replication for the host, thereby
enabling large numbers of plasmids to be generated in the cytoplasm without
chromosomal limitations. However, it is within the scope herein to integrate multiple
copies of the protease gene into the host genome. This is facilitated by procaryotic and
eucaryotic organisms which are particularly susceptible to homologous
recombination.

The gene can be a natural B. lentus gene. Alternatively, a synthetic gene
encoding a naturally-occurring or mutant precursor protease may be produced. In
such an approach, the DNA and/or amino acid sequence of the precursor protease is
determined. Multiple, overlapping synthetic single-stranded DNA fragments are
thereafter synthesized, which upon hybridization and ligation produce a synthetic
DNA encoding the precursor protease. An example of synthetic gene construction is
set forth in Example 3 of US Patent 5,204,015, the disclosure of which is
incorporated herein by reference.

Once the naturally-occurring or synthetic precursor protease gene has been
cloned, a number of modifications are undertaken to enhance the use of the gene
beyond synthesis of the naturally-occurring precursor protease. Such modifications
include the production of recombinant proteases as disclosed in US Patent RE
34,606 and EPO Publication No. 0 251 446 and the production of protease variants
described herein.

The following cassette mutagenesis method may be used to facilitate the
construction of the protease variants of the present invention, although other
methods may be used. First, the naturally-occurring gene encoding the protease is
obtained and sequenced in whole or in part. Then the sequence is scanned for a
point at which it is desired to make a mutation (deletion, insertion or substitution) of
one or more amino acids in the encoded enzyme. The sequences flanking this point
are evaluated for the presence of restriction sites for replacing a short segment of
the gene with an oligonucleotide pool which when expressed will encode various
mutants. Such restriction sites are preferably unique sites within the protease gene
so as to facilitate the replacement of the gene segment. However, any convenient
restriction site which is not overly redundant in the protease gene may be used,
provided the gene fragments generated by restriction digestion can be reassembled
in proper sequence. If restriction sites are not present at locations within a
convenient distance from the selected point (from 10 to 15 nucleotides), such sites
are generated by substituting nucleotides in the gene in such a fashion that neither
the reading frame nor the amino acids encoded are changed in the final construction.
Mutation of the gene in order to change its sequence to conform to the desired
sequence is accomplished by M13 primer extension in accord with generally known
methods. The task of locating suitable flanking regions and evaluating the needed
changes to arrive at two convenient restriction site sequences is made routine by the
redundancy of the genetic code, a restriction enzyme map of the gene and the large
number of different restriction enzymes. Note that if a convenient flanking restriction
site is available, the above method need be used only in connection with the flanking
region which does not contain a site.
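
By way of illustration only, the search for unique restriction sites flanking a selected mutation point may be sketched in Python as follows. This is a minimal sketch and not the method of the invention: the enzyme table holds a few well-known palindromic recognition sequences chosen arbitrarily, and the function names, the example fragment and the 15-nucleotide default window (taken from the upper end of the range stated above) are assumptions made for the example.

```python
# Recognition sequences of a few common restriction enzymes. The selection
# is illustrative; all four are palindromic, so one strand suffices.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
}


def find_sites(gene):
    """Return the 0-based start positions of every recognition site."""
    gene = gene.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = gene.find(site)
        while start != -1:
            positions.append(start)
            start = gene.find(site, start + 1)
        hits[enzyme] = positions
    return hits


def flanking_unique_sites(gene, point, window=15):
    """List sites occurring exactly once in the gene that lie within
    `window` nucleotides upstream or downstream of `point`."""
    upstream, downstream = [], []
    for enzyme, positions in find_sites(gene).items():
        if len(positions) != 1:
            continue  # absent or redundant site: not a candidate
        pos = positions[0]
        site_end = pos + len(SITES[enzyme])
        if site_end <= point and point - site_end <= window:
            upstream.append((enzyme, pos))
        elif pos > point and pos - point <= window:
            downstream.append((enzyme, pos))
    return upstream, downstream


# Hypothetical 30-nucleotide fragment, not taken from any protease gene.
fragment = "ATGGAATTCGCTGGTTCTAACGGATCCTAA"
print(flanking_unique_sites(fragment, point=15))
```

Where one of the two returned lists is empty, a site would have to be introduced on that side by silent nucleotide substitution, as described above.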

Once the naturally-occurring DNA or synthetic DNA is cloned, the restriction
sites flanking the positions to be mutated are digested with the cognate restriction
enzymes and a plurality of end termini-complementary oligonucleotide cassettes are
ligated into the gene. The mutagenesis is simplified by this method because all of
the oligonucleotides can be synthesized so as to have the same restriction sites, and
no synthetic linkers are necessary to create the restriction sites.

As used herein, proteolytic activity is defined as the rate of hydrolysis of
peptide bonds per milligram of active enzyme. Many well known procedures exist for
measuring proteolytic activity (K. M. Kalisz, "Microbial Proteinases," Advances in
Biochemical Engineering/Biotechnology, A. Fiechter ed., 1988). In addition to or as
an alternative to modified proteolytic activity, the variant enzymes of the present
invention may have other modified properties such as Km, kcat, kcat/Km ratio and/or
modified substrate specificity and/or modified pH activity profile. These enzymes
can be tailored for the particular substrate which is anticipated to be present, for
example, in the preparation of peptides or for hydrolytic processes such as laundry
uses.

In one aspect of the invention, the objective is to secure a variant protease
having altered, preferably improved, wash performance as compared to a precursor
protease in at least one detergent formulation and/or under at least one set of wash
conditions.

There is a variety of wash conditions, including varying detergent
formulations, wash water volume, wash water temperature and length of wash time,
that a protease variant might be exposed to. For example, detergent formulations
used in different areas have different concentrations of their relevant components
present in the wash water. A European detergent typically has about
4500-5000 ppm of detergent components in the wash water, while a Japanese
detergent typically has approximately 667 ppm of detergent components in the wash
water. In North America, particularly the United States, a detergent typically has
about 975 ppm of detergent components present in the wash water.

A low detergent concentration system includes detergents where less than
about 800 ppm of detergent components are present in the wash water. Japanese
detergents are typically considered low detergent concentration systems as they have
approximately 667 ppm of detergent components present in the wash water.

A medium detergent concentration system includes detergents where between about
800 ppm and about 2000 ppm of detergent components are present in the wash
water. North American detergents are generally considered to be medium detergent
concentration systems as they have approximately 975 ppm of detergent
components present in the wash water. Brazil typically has approximately 1500 ppm
of detergent components present in the wash water.

A high detergent concentration system includes detergents where greater
than about 2000 ppm of detergent components are present in the wash water.
European detergents are generally considered to be high detergent concentration
systems as they have approximately 4500-5000 ppm of detergent components in the
wash water.

Latin American detergents are generally high suds phosphate builder
detergents and the range of detergents used in Latin America can fall in both the
medium and high detergent concentrations as they range from 1500 ppm to 6000
ppm of detergent components in the wash water. As mentioned above, Brazil
typically has approximately 1500 ppm of detergent components present in the wash
water. However, other high suds phosphate builder detergent geographies, not
limited to other Latin American countries, may have high detergent concentration
systems up to about 6000 ppm of detergent components present in the wash water.

In light of the foregoing, it is evident that concentrations of detergent
compositions in typical wash solutions throughout the world vary from less than
about 800 ppm of detergent composition ("low detergent concentration
geographies"), for example about 667 ppm in Japan, to between about 800 ppm and
about 2000 ppm ("medium detergent concentration geographies"), for example about
975 ppm in the U.S. and about 1500 ppm in Brazil, to greater than about 2000 ppm
("high detergent concentration geographies"), for example about 4500 ppm to about
5000 ppm in Europe and about 6000 ppm in high suds phosphate builder
geographies.
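
The three concentration classes summarized above may be expressed, purely as an illustrative sketch, by the following Python function. The function name is hypothetical, and because the thresholds are stated only as "about" 800 ppm and "about" 2000 ppm, treating them as sharp boundaries (with exactly 800 and 2000 ppm falling in the medium class) is an assumption made for the example.

```python
def classify_detergent_concentration(ppm):
    """Map wash-water detergent concentration (ppm) to the class
    described in the text. Boundary handling is an assumption."""
    if ppm < 0:
        raise ValueError("concentration cannot be negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Typical values quoted in the text for several geographies.
TYPICAL_PPM = {
    "Japan": 667,
    "United States": 975,
    "Brazil": 1500,
    "Europe": 4500,
}

for geography, ppm in TYPICAL_PPM.items():
    print(geography, ppm, classify_detergent_concentration(ppm))
```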




The concentrations of the typical wash solutions are determined empirically.
For example, in the U.S., a typical washing machine holds a volume of about 64.4 L
of wash solution. Accordingly, in order to obtain a concentration of about 975 ppm of
detergent within the wash solution about 62.79 g of detergent composition must be
added to the 64.4 L of wash solution. This amount is the typical amount measured
into the wash water by the consumer using the measuring cup provided with the
detergent.
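
The arithmetic behind the figure of about 62.79 g may be checked with the short Python sketch below. It assumes that 1 ppm corresponds to 1 mg of detergent composition per liter of wash solution, which is the usual convention for dilute aqueous solutions but is not stated in the text; the function name is hypothetical.

```python
def detergent_dose_grams(target_ppm, wash_volume_liters):
    """Grams of detergent needed to reach target_ppm in the given volume,
    assuming 1 ppm = 1 mg/L (an assumption, see lead-in)."""
    milligrams = target_ppm * wash_volume_liters
    return milligrams / 1000.0


# U.S. example from the text: 975 ppm in a 64.4 L wash.
dose = detergent_dose_grams(975, 64.4)
print(round(dose, 2))  # 62.79
```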

As a further example, different geographies use different wash temperatures.
The temperature of the wash water in Japan is typically less than that used in
Europe.

Accordingly, one aspect of the present invention includes a protease variant
that shows improved wash performance in at least one set of wash conditions.

In another aspect of the invention, it has been determined that substitution of
an amino acid at one or more residue positions corresponding to residue positions
selected from the group consisting of 62, 212, 230, 232, 252 and 257 of Bacillus
amyloliquefaciens subtilisin is important in improving the wash performance of the
enzyme.
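
As an illustrative aid only, the following Python sketch checks whether a variant carries a substitution at any of the six positions listed above. It assumes that variants are written in the common single-letter, slash-separated notation (original residue, position in Bacillus amyloliquefaciens subtilisin numbering, replacement residue); the function names and the example variant string are hypothetical and are not taken from the tables.

```python
import re

# Positions named in the text (B. amyloliquefaciens subtilisin numbering).
TARGET_POSITIONS = {62, 212, 230, 232, 252, 257}

# One substitution such as "N62D": letter, digits, letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(variant):
    """Parse e.g. 'N62D/S103A' into [(original, position, replacement)]."""
    substitutions = []
    for token in variant.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        original, position, replacement = match.groups()
        substitutions.append((original, int(position), replacement))
    return substitutions


def targeted_positions(variant):
    """Return the sorted listed positions that the variant substitutes."""
    positions = {pos for _, pos, _ in parse_variant(variant)}
    return sorted(positions & TARGET_POSITIONS)


# Hypothetical variant used only to exercise the functions.
print(targeted_positions("N62D/S103A/N252K"))  # [62, 252]
```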

These substitutions are preferably made in Bacillus lentus (recombinant or
native-type) subtilisin, although the substitutions may be made in any Bacillus
protease.

Based on the screening results obtained with the variant proteases, the noted 
mutations in Bacillus amyloliquefaciens subtilisin are important to the proteolytic 
activity, performance and/or stability of these enzymes and the cleaning or wash 
performance of such variant enzymes. 

Many of the protease variants of the invention are useful in formulating
various detergent compositions or personal care formulations such as shampoos or
lotions. A number of known compounds are suitable surfactants useful in
compositions comprising the protease mutants of the invention. These include
nonionic, anionic, cationic, or zwitterionic detergents, as disclosed in US 4,404,128
to Barry J. Anderson and US 4,261,868 to Jiri Flora, et al. A suitable detergent
formulation is that described in Example 7 of US Patent 5,204,015 (previously
incorporated by reference). The art is familiar with the different formulations which
can be used as cleaning compositions. In addition to typical cleaning compositions,
it is readily understood that the protease variants of the present invention may be
used for any purpose that native or wild-type proteases are used. Thus, these
variants can be used, for example, in bar or liquid soap applications, dishcare
formulations, contact lens cleaning solutions or products, peptide hydrolysis, waste
treatment, textile applications, as fusion-cleavage enzymes in protein production,
etc. The variants of the present invention may exhibit enhanced performance in a
detergent composition (as compared to the precursor). As used herein, enhanced
performance in a detergent is defined as increasing cleaning of certain enzyme
sensitive stains such as grass or blood, as determined by usual evaluation after a
standard wash cycle.

Proteases of the invention can be formulated into known powdered and liquid
detergents having pH between 6.5 and 12.0 at levels of about 0.01 to about 5%
(preferably 0.1% to 0.5%) by weight. These detergent cleaning compositions can
also include other enzymes such as known proteases, amylases, cellulases, lipases
or endoglycosidases, as well as builders and stabilizers.

The addition of proteases of the invention to conventional cleaning
compositions does not create any special use limitation. In other words, any
temperature and pH suitable for the detergent is also suitable for the present
compositions as long as the pH is within the above range, and the temperature is
below the described protease's denaturing temperature. In addition, proteases of
the invention can be used in a cleaning composition without detergents, again either
alone or in combination with builders and stabilizers.

The present invention also relates to cleaning compositions containing the
protease variants of the invention. The cleaning compositions may additionally
contain additives which are commonly used in cleaning compositions. These can be
selected from, but not limited to, bleaches, surfactants, builders, enzymes and
bleach catalysts. It would be readily apparent to one of ordinary skill in the art what
additives are suitable for inclusion into the compositions. The list provided herein is
by no means exhaustive and should be only taken as examples of suitable additives.
It will also be readily apparent to one of ordinary skill in the art to only use those
additives which are compatible with the enzymes and other components in the
composition, for example, surfactant.

When present, the amount of additive present in the cleaning composition is
from about 0.01% to about 99.9%, preferably about 1% to about 95%, more
preferably about 1% to about 80%.

The variant proteases of the present invention can be included in animal feed
such as part of animal feed additives as described in, for example, US 5,612,055;
US 5,314,692; and US 5,147,642.

One aspect of the invention is a composition for the treatment of a textile that
includes variant proteases of the present invention. The composition can be used to
treat, for example, silk or wool as described in publications such as RD 216,034; EP
134,267; US 4,533,359; and EP 344,259.

The following is presented by way of example and is not to be construed as a
limitation to the scope of the claims.

All publications and patents referenced herein are hereby incorporated by
reference in their entirety.

Example 1 

A large number of protease variants were produced and purified using
methods well known in the art. All mutations were made in Bacillus lentus GG36
subtilisin. The variants are shown in Table 4.

Example 2

A large number of the protease variants produced in Example 1 were tested
for performance in two types of detergent and wash conditions using a microswatch
assay described in "An improved method of assaying for a preferred enzyme and/or
preferred detergent composition", U.S. Serial No. 60/068,796.

Table 5 lists the variant proteases assayed and the results of testing in two
different detergents. For column A, the detergent was 0.67 g/l filtered Ariel Ultra
(Procter & Gamble, Cincinnati, OH, USA), in a solution containing 3 grains per gallon
mixed Ca2+/Mg2+ hardness, and 0.3 ppm enzyme was used in each well at 20°C.
For column B, the detergent was 3.38 g/l filtered Ariel Futur (Procter & Gamble,
Cincinnati, OH, USA), in a solution containing 15 grains per gallon mixed Ca2+/Mg2+
hardness, and 0.3 ppm enzyme was used in each well at 40°C.
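
For readers more accustomed to metric units, the hardness values may be converted with the Python sketch below. It is offered under stated assumptions only: the text does not say which gallon is meant or how hardness is expressed, so the sketch assumes the US liquid gallon and simply converts grains per gallon to milligrams per liter (conventionally reported as calcium carbonate equivalent). The function name is hypothetical.

```python
GRAIN_MG = 64.79891        # one grain in milligrams (exact by definition)
US_GALLON_L = 3.785411784  # one US liquid gallon in liters (exact by definition)


def gpg_to_mg_per_liter(grains_per_gallon):
    """Convert water hardness from grains per US gallon to mg/L.
    The US gallon is an assumption; see the lead-in."""
    return grains_per_gallon * GRAIN_MG / US_GALLON_L


# Hardness levels used for columns A and B.
for gpg in (3, 15):
    print(gpg, "gpg is about", round(gpg_to_mg_per_liter(gpg), 1), "mg/L")
```

Under these assumptions one grain per gallon is roughly 17.1 mg/L, so the two conditions differ fivefold in hardness as well as in detergent dose and temperature.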

Example 3

Table 6 lists the variant proteases assayed from Example 1 and the results of
testing in four different detergents. The same performance tests as in Example 2
were done on the noted variant proteases with the following detergents. For column
A, the detergent was 0.67 g/l filtered Ariel Ultra (Procter & Gamble, Cincinnati, OH,
USA), in a solution containing 3 grains per gallon mixed Ca2+/Mg2+ hardness, and
0.3 ppm enzyme was used in each well at 20°C. For column B, the detergent was
3.38 g/l filtered Ariel Futur (Procter & Gamble, Cincinnati, OH, USA), in a solution
containing 15 grains per gallon mixed Ca2+/Mg2+ hardness, and 0.3 ppm enzyme
was used in each well at 40°C. For column C, the detergent was 3.5 g/l HSP1
detergent (Procter & Gamble, Cincinnati, OH, USA), in a solution containing 8 grains
per gallon mixed Ca2+/Mg2+ hardness, and 0.3 ppm enzyme was used in each well
at 20°C. For column D, the detergent was 1.5 ml/l Tide KT detergent (Procter &
Gamble, Cincinnati, OH, USA), in a solution containing 3 grains per gallon mixed
Ca2+/Mg2+ hardness, and 0.3 ppm enzyme was used in each well at 20°C.

WHAT IS CLAIMED:

1. A protease variant comprising replacing an amino acid at one or more
residue positions corresponding to residue positions selected from the group
consisting of 62, 212, 230, 232, 252 and 257 of Bacillus amyloliquefaciens subtilisin.

2. The protease variant according to claim 1 which is derived from a Bacillus
subtilisin.

3. The protease variant according to claim 2 which is derived from Bacillus
lentus subtilisin.

4. A DNA encoding a protease variant of claim 1.

5. An expression vector encoding the DNA of claim 4.

6. A host cell transformed with the expression vector of claim 5.

7. A cleaning composition comprising the protease variant of claim 1.

8. An animal feed comprising the protease variant of claim 1.

9. A composition for treating a textile comprising the protease variant of
claim 1.

10. The protease variant according to claim 1 comprising a substitution set
selected from the group consisting of residue positions corresponding to positions in
Table 1 of Bacillus amyloliquefaciens subtilisin.

11. The protease variant according to claim 10 comprising a substitution set
selected from the group consisting of residue positions corresponding to positions in
Table 2 of Bacillus amyloliquefaciens subtilisin.

12. The protease variant according to claim 10 comprising a substitution set
selected from the group consisting of residue positions corresponding to positions in
Table 3 of Bacillus amyloliquefaciens subtilisin.

CONSERVED RESIDUES IN SUBTILISINS FROM
BACILLUS AMYLOLIQUEFACIENS

1 10 20
AQSVP.G A P A . H . .G

21 30 40
.TGS.VKVAV.D.G. . . .HP

41 50 60
DL. . .GGAS.VP QD

61 70 80
. N . HGTHVAGT . AALNNS IG

81 90 100
VLGVAPSA. LYAVKVLGA. G

101 110 120
S G . . S . L . .G.EWA.N. . . .

121 130 140
V.H.SLG.PS.S A. .

141 150 160
GV.VVAA.GN.G. . .

161 170 180
YP. .Y. . . . A V G A .

181 190 200
D. . N . .ASPS. .G. . L D . .A

201 210 220
PGV. .QST.PG. . Y . . . N G T

221 230 240
SMA . PHVAGAAAL . . . K . . .

241 250 260
W . . . Q . R . . L . N T . . . L G . .

261 270
. .YG.GL.N. .AA. .